A parallel Clustering algorithm based on minimum spanning tree for microarrays data analysis
Authors
Abstract
Clustering is the partitioning of a set of observations into groups called clusters, where the observations in the same group share a common characteristic. One of the best-known algorithms for clustering microarray data using a minimum spanning tree (MST) is the CLUMP algorithm (Clustering algorithm through MST in Parallel), which identifies dense clusters in a noisy background. The MST construction phase of CLUMP is the time-consuming phase. This paper presents an improved version of the CLUMP algorithm called iCLUMP (improved Clustering algorithm through MST in Parallel). iCLUMP speeds up MST construction using the cover tree data structure. The implementation shows that iCLUMP is more efficient than CLUMP in terms of complexity and runtime.
Key-Words: Clustering; Minimum spanning tree; Microarrays; Bioinformatics; Parallel algorithm.
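The general idea of MST-based clustering mentioned in the abstract can be illustrated with a minimal sequential sketch. This is not CLUMP or iCLUMP: it uses no cover tree and no parallelism, and the rule of cutting the `k - 1` longest MST edges is a classic stand-in assumed here for illustration, not the dense-cluster identification step described by the authors. The function names `prim_mst` and `mst_clusters` are hypothetical.

```python
import math


def prim_mst(points):
    """Return the MST edges (weight, u, v) of the complete Euclidean graph.

    Plain O(n^2) Prim's algorithm; an assumption for illustration only.
    """
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best = [math.inf] * n  # cheapest known distance from the tree to each vertex
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the vertex outside the tree that is closest to it.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((best[u], parent[u], u))
        # Relax distances of the remaining vertices through u.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
                    parent[v] = u
    return edges


def mst_clusters(points, k):
    """Split points into k clusters by dropping the k - 1 longest MST edges."""
    edges = sorted(prim_mst(points))
    kept = edges[: max(len(edges) - (k - 1), 0)]

    # Union-find over the kept edges gives the connected components.
    root = list(range(len(points)))

    def find(x):
        while root[x] != x:
            root[x] = root[root[x]]  # path halving
            x = root[x]
        return x

    for _, u, v in kept:
        root[find(u)] = find(v)

    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


if __name__ == "__main__":
    # Two well-separated toy groups (made-up data, not microarray values).
    pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    print(mst_clusters(pts, 2))
```

The quadratic Prim loop makes the cost of MST construction on a complete graph visible, which is the phase the abstract identifies as the bottleneck.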
Similar resources
hiCLUMP: A hybrid Implementation of the CLUMP Algorithm for Clustering Microarrays Data
Microarray technology allows us to measure the expression levels of hundreds of thousands of genes simultaneously. The microarray data analysis process involves various computationally heavy tasks such as clustering. Clustering can be defined as partitioning a dataset into groups where objects in the same group are similar in some way. CLUMP (clustering through MST in parallel) is one of the ...
An Efficient Parallel Data Clustering Algorithm Using Isoperimetric Number of Trees
We propose a parallel graph-based data clustering algorithm using CUDA GPU, based on exact clustering of the minimum spanning tree in terms of a minimum isoperimetric criterion. We also provide a comparative performance analysis of our algorithm against related ones, which demonstrates the general superiority of this parallel algorithm over competing algorithms in terms of accuracy and s...
Classification of encrypted traffic for applications based on statistical features
Traffic classification plays an important role in many aspects of network management, such as identifying the type of transferred data, detecting malware applications, and applying policies to restrict network access. Basic methods in this field used obvious traffic features such as port number and protocol type to classify the traffic type. However, recent changes in applicat...
An Adaptive Parallel Hierarchical Clustering Algorithm
Clustering of data has numerous applications and has been studied extensively. It is very important in bioinformatics and data mining. Though many parallel algorithms have been designed, most of them use the CRCW-PRAM or CREW-PRAM models of computing. This paper proposes a parallel EREW deterministic algorithm for hierarchical clustering. Based on algorithms of complete graph and Euclidea...
A Metaheuristic Algorithm for the Minimum Routing Cost Spanning Tree Problem
The routing cost of a spanning tree in a weighted, connected graph is defined as the total length of the paths between all pairs of vertices. The objective of the minimum routing cost spanning tree problem is to find a spanning tree whose routing cost is minimum. This is an NP-hard problem, for which we present a GRASP with path-relinking metaheuristic algorithm. GRASP is a multi-start ...